University of Illinois at Chicago: Journals@UIC

MLTrends: Graphing MEDLINE term usage over time

Author: Andrade-Navarro Miguel A
Palidwor Gareth A
Publication venue: University of Illinois at Chicago Library
Publication date: 22/01/2010

The MEDLINE database of medical literature is routinely used by researchers and doctors to find articles pertaining to their area of interest. Insight into historical changes in research areas and use of scientific language may be gained by chronological analysis of the 18 million records currently in the database; however, such analysis is generally complex and time-consuming. The authors’ MLTrends web application graphs term usage in MEDLINE over time, allowing the determination of emergence dates for biomedical terms and historical variations in term usage intensity. Terms considered are individual words or quoted phrases, which may be combined using Boolean operators. MLTrends can plot, for multiple terms simultaneously, the number of MEDLINE records per year whose titles or abstracts match each queried term. The MEDLINE database is stored and indexed on the MLTrends server, allowing queries to be completed and graphs generated in less than one second. Queries may be performed on all titles and/or abstracts in MEDLINE and can include stop words. The resulting graphs may be normalized by total publications or words per year to facilitate comparison of term usage between years. This makes MLTrends a powerful tool for rapid graphical evaluation of the evolution of biomedical research and language. MLTrends may be used at: http://www.ogic.ca/mltrend
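
The per-year counting and normalization described in this abstract can be illustrated with a minimal sketch. It is not the MLTrends implementation: the function name `term_usage_by_year`, the `(year, text)` record layout, and the toy records are assumptions made for illustration, and plain case-insensitive substring matching stands in for the indexed word and phrase search the abstract describes.

```python
from collections import Counter


def term_usage_by_year(records, term, normalize=False):
    """Count records per year whose text contains `term`.

    `records` is an iterable of (year, text) pairs (a hypothetical layout).
    With normalize=True, each count is divided by that year's total number
    of records, mirroring normalization by total publications per year.
    """
    needle = term.lower()
    totals = Counter()   # all records seen per year
    matches = Counter()  # records per year containing the term
    for year, text in records:
        totals[year] += 1
        if needle in text.lower():
            matches[year] += 1
    if normalize:
        return {year: matches[year] / totals[year] for year in sorted(totals)}
    return {year: matches[year] for year in sorted(totals)}


# Toy data standing in for MEDLINE titles (invented for this example).
records = [
    (1995, "Cloning of a novel kinase"),
    (1995, "Apoptosis in neuronal cells"),
    (2005, "Microarray analysis of apoptosis"),
    (2005, "Apoptosis and autophagy crosstalk"),
]
counts = term_usage_by_year(records, "apoptosis")
fractions = term_usage_by_year(records, "apoptosis", normalize=True)
```

Normalizing by the yearly total separates growth in a term's usage from growth of the literature as a whole, which is the comparison between years that the abstract refers to.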

K2D2: Estimation of protein secondary structure from circular dichroism spectra

Author: Andrade-Navarro Miguel A
Perez-Iratxeta Carolina
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Circular dichroism spectroscopy is a widely used technique to analyze the secondary structure of proteins in solution. Predictive methods use the circular dichroism spectra from proteins of known tertiary structure to assess the secondary structure contents of a protein with unknown structure given its circular dichroism spectrum. Results: We developed K2D2, a method with an associated web server to estimate protein secondary structure from circular dichroism spectra. The method uses a self-organized map of spectra from proteins with known structure to deduce a map of protein secondary structure that is used to do the predictions. Conclusion: The K2D2 server is publicly accessible at http://www.ogic.ca/projects/k2d2/. It accepts as input a circular dichroism spectrum and outputs the estimated secondary structure content (alpha-helix and beta-strand) of the corresponding protein, as well as an estimated measure of error.

Pseudogenes as an alternative source of natural antisense transcripts

Author: Andrade-Navarro Miguel A
Muro Enrique M
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Naturally occurring antisense transcripts (NATs) are non-coding RNAs that may regulate the activity of sense transcripts to which they bind because of complementarity. NATs that are not located in the gene they regulate (trans-NATs) have better chances to evolve than cis-NATs, which is evident when the sense strand of the cis-NAT is part of a protein coding gene. However, the generation of a trans-NAT requires the formation of a relatively large region of complementarity to the gene it regulates. Results: Pseudogene formation may be one evolutionary mechanism that generates trans-NATs to the parental gene. For example, this could occur if the parental gene is regulated by a cis-NAT that is copied as a trans-NAT in the pseudogene. To support this, we identified human pseudogenes with a trans-NAT to the parental gene in their antisense strand by analysis of the database of expressed sequence tags (ESTs). We found that the mutations that appeared in these trans-NATs after the pseudogene formation do not show the flat distribution that would be expected in a non-functional transcript. Instead, we found higher similarity to the parental gene in a region near the 3' end of the trans-NATs. Conclusions: Our results do not imply a functional relation of the trans-NAT arising from pseudogenes over their respective parental genes but add evidence for it and stress the importance of duplication mechanisms of genetic material in the generation of non-coding RNAs. We also provide a plausible explanation for the large transcripts that can be found in the antisense strand of some pseudogenes.

Identification of novel stem cell markers using gap analysis of gene expression data

Author: Andrade-Navarro Miguel A
Krzyzanowski Paul M
Publication venue: BioMed Central
Publication date: 01/01/2007

A method for the detection of marker genes in large heterogeneous collections of gene expression data is described and applied to DNA microarray data generated from 83 mouse stem cell-related samples.

Elsevier - Publisher Connector

Comparison of inter- and intraspecies variation in humans and fruit flies

Author: Andrade-Navarro Miguel A.
Hodge Russ
Shih Juliann
Publication venue: The Authors. Published by Elsevier Inc.
Publication date: 22/11/2014

Variation is essential to species survival and adaptation during evolution. This variation is conferred by the imperfection of biochemical processes, such as mutations and alterations in DNA sequences, and can also be seen within genomes through processes such as the generation of antibodies. Recent sequencing projects have produced multiple versions of the genomes of humans and fruit flies (Drosophila melanogaster). These give us a chance to study how individual gene sequences vary within and between species. Here we arranged human and fly genes in orthologous pairs and compared such within-species variability with their degree of conservation between flies and humans. We observed that a significant number of proteins associated with mRNA translation are highly conserved between species and yet are highly variable within each species. The fact that we observe this in two species whose lineages separated more than 700 million years ago suggests that this is the result of a very ancient process. We hypothesize that this effect might be attributed to a positive selection for variability of virus-interacting proteins that confers a general resistance to viral hijacking of the mRNA translation machinery within populations. Our analysis points to this and to other processes resulting in positive selection for gene variation.

Armadillo Motifs Involved in Vesicular Transport

Author: Andrade-Navarro Miguel A.
Heinemann Udo
Striegl Harald
Publication venue: Public Library of Science
Publication date: 01/01/2009

Armadillo (ARM) repeat proteins function in various cellular processes including vesicular transport and membrane tethering. They contain an imperfect repeating sequence motif that forms a conserved three-dimensional structure. Recently, structural and functional insight into tethering mediated by the ARM-repeat protein p115 has been provided. Here we describe the p115 ARM-motifs for reasons of clarity and nomenclature and show that both sequence and structure are highly conserved among ARM-repeat proteins. We argue that there is no need to invoke repeat types other than ARM repeats for a proper description of the structure of the p115 globular head region. Additionally, we propose to define a new subfamily of ARM-like proteins and show lack of evidence that the ARM motifs found in p115 are present in other long coiled-coil tethering factors of the golgin family

CiteSeerX

Manifold learning and maximum likelihood estimation for hyperbolic network embedding

Author: Gregorio Alanis-Lobato
Miguel A. Andrade-Navarro
Pablo Mier
Publication venue: Springer Nature
Publication date: 01/01/2016

Linking genes to diseases: it's all in the data

Author: Andrade-Navarro Miguel A
Perez-Iratxeta Carolina
Tiffin Nicki
Publication venue: BioMed Central
Publication date: 01/01/2009

Genome-wide association analyses on large patient cohorts are generating large sets of candidate disease genes. This is coupled with the availability of ever-increasing genomic databases and a rapidly expanding repository of biomedical literature. Computational approaches to disease-gene association attempt to harness these data sources to identify the most likely disease gene candidates for further empirical analysis by translational researchers, resulting in efficient identification of genes of diagnostic, prognostic and therapeutic value. Existing computational methods analyze gene structure and sequence, functional annotation of candidate genes, characteristics of known disease genes, gene regulatory networks, protein-protein interactions, data from animal models and disease phenotype. To date, a few studies have successfully applied computational analysis of clinical phenotype data for specific diseases and shown genetic associations. In the near future, computational strategies will be facilitated by improved integration of clinical and computational research, and by increased availability of clinical phenotype data in a format accessible to computational approaches

Taxonomic colouring of phylogenetic trees of protein sequences

Author: Andrade-Navarro Miguel A
Palidwor Gareth
Reynaud Emmanuel G
Publication venue: BioMed Central
Publication date: 01/02/2006

BACKGROUND: Phylogenetic analyses of protein families are used to define the evolutionary relationships between homologous proteins. The interpretation of protein-sequence phylogenetic trees requires the examination of the taxonomic properties of the species associated to those sequences. However, there is no online tool to facilitate this interpretation, for example, by automatically attaching taxonomic information to the nodes of a tree, or by interactively colouring the branches of a tree according to any combination of taxonomic divisions. This is especially problematic if the tree contains on the order of hundreds of sequences, which, given the accelerated increase in the size of the protein sequence databases, is a situation that is becoming common. RESULTS: We have developed PhyloView, a web based tool for colouring phylogenetic trees upon arbitrary taxonomic properties of the species represented in a protein sequence phylogenetic tree. Provided that the tree contains SwissProt, SpTrembl, or GenBank protein identifiers, the tool retrieves the taxonomic information from the corresponding database. A colour picker displays a summary of the findings and allows the user to associate colours to the leaves of the tree according to any number of taxonomic partitions. Then, the colours are propagated to the branches of the tree. CONCLUSION: PhyloView can be used at . A tutorial, the software with documentation, and GPL licensed source code, can be accessed at the same web address
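
The step in which leaf colours are propagated to the branches can be sketched as follows. The abstract does not state the propagation rule, so the rule used here (an internal node takes a colour only when all of its children share that colour) is an assumption, as are the function name `propagate`, the nested-tuple tree layout, and the example species and colours.

```python
def propagate(node, leaf_colours):
    """Annotate a tree with colours, working from the leaves upward.

    `node` is either a leaf name (str) or a tuple of child nodes.
    Assumed rule: an internal node inherits a colour only if every child
    has that same colour; otherwise it stays uncoloured (None).
    """
    if isinstance(node, str):
        # Leaf: look up the colour the user assigned, if any.
        return {"name": node, "colour": leaf_colours.get(node), "children": []}
    children = [propagate(child, leaf_colours) for child in node]
    colours = {child["colour"] for child in children}
    colour = colours.pop() if len(colours) == 1 else None
    return {"name": None, "colour": colour, "children": children}


# Hypothetical four-leaf tree and a user-chosen taxonomic colouring.
tree = (("human", "mouse"), ("fly", "yeast"))
leaf_colours = {"human": "red", "mouse": "red", "fly": "blue", "yeast": "green"}
annotated = propagate(tree, leaf_colours)
```

In this example the (human, mouse) branch is coloured red, while the (fly, yeast) branch and the root stay uncoloured because their descendants fall into different partitions.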

Research Repository UCD